2 (Deep) Neural Networks
2.1 What is a Deep Neural Network?
2.1.1 Remark
Key Ideas:
1) Humans are good at detecting patterns
2) The brain solves a variety of tasks universally
3) Evolution has already perfected the construction
2.1.2 Remark (How to build a neuron?)
If the sum of input signals into a neuron surpasses a threshold, the neuron sends a signal.
2.1.3 Definition
An artificial neuron with weights \( \omega_1, \dots, \omega_n \in \mathbb{R} \), a bias \( b \in \mathbb{R} \), and an activation function (rectifier) \( \sigma: \mathbb{R} \to \mathbb{R} \) is the function \( f: \mathbb{R}^n \to \mathbb{R} \) given by
$$ \begin{aligned} f(x_1, \dots, x_n) &= \sigma \left( \sum_{i=1}^{n} x_i \omega_i - b \right) \\ &= \sigma \left( \langle x, \omega \rangle - b \right) \end{aligned} $$
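A minimal NumPy sketch of Definition 2.1.3; the function names, the choice of the Heaviside function as \( \sigma \), and the example numbers are illustrative and not taken from the notes:

```python
import numpy as np

def heaviside(x):
    # Heaviside activation: 1 for x > 0, 0 otherwise
    return np.where(x > 0, 1.0, 0.0)

def neuron(x, w, b, sigma=heaviside):
    # f(x) = sigma(<x, w> - b), cf. Definition 2.1.3
    return sigma(np.dot(x, w) - b)

# Example: the neuron "fires" once the weighted input exceeds the bias
x = np.array([0.5, 1.0, -0.2])
w = np.array([1.0, 2.0, 0.5])
print(neuron(x, w, b=1.0))  # <x, w> - b = 1.4 > 0, so the output is 1.0
```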
2.1.4 Examples of activation functions
1) Heaviside function
$$ \sigma(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \leq 0 \end{cases} $$
2) Sigmoidal function
$$ \sigma(x) = \frac{1}{1+e^{-x}} $$
3) Rectified Linear Unit (ReLU)
$$ \sigma(x) = \max \lbrace 0, x \rbrace $$
4) Softplus function
$$ \sigma(x) = \ln \left( 1 + e^{x} \right) $$
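The four examples above can be written in a few lines of NumPy; this is only an illustrative sketch, and the function names are not fixed anywhere in the notes:

```python
import numpy as np

def heaviside(x):
    # 1 for x > 0, 0 for x <= 0
    return np.where(x > 0, 1.0, 0.0)

def sigmoid(x):
    # 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # max{0, x}, applied elementwise
    return np.maximum(0.0, x)

def softplus(x):
    # ln(1 + e^{x}); log1p keeps the result accurate when e^x is close to 0
    return np.log1p(np.exp(x))

x = np.linspace(-2, 2, 5)
for f in (heaviside, sigmoid, relu, softplus):
    print(f.__name__, f(x))
```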
2.1.5 Remark and Definition
An artificial neural network is a graph whose nodes are artificial neurons. Special case: a feed-forward neural network is a directed acyclic graph of artificial neurons. Neural networks that are not feed-forward are called recurrent neural networks.
2.1.6 Definition
Let \( d \in \mathbb{N} \) be the input dimension, \( L \in \mathbb{N} \) the number of layers, \( N_0 = d, N_1, \dots, N_L \in \mathbb{N} \) the numbers of neurons per layer, \( A_l \in \mathbb{R}^{N_l \times N_{l-1}} \), \( l = 1, \dots, L \), the weight matrices, \( b_l \in \mathbb{R}^{N_l} \), \( l = 1, \dots, L \), the biases, and \( \sigma: \mathbb{R} \to \mathbb{R} \) the activation function. Then
$$ \Phi = \left( \left( A_l, b_l \right) \right)_{l=1}^L $$
is called (the architecture of) a neural network. The map
$$ \mathcal{R}_{\sigma} \left( \Phi \right): \mathbb{R}^{d} \to \mathbb{R}^{N_L}, $$
$$ \mathcal{R}_{\sigma} \left( \Phi \right)(x) = x_L $$
with
$$ \begin{aligned} x_0 &:= x, \\ x_l &:= \sigma(A_l \cdot x_{l-1} - b_l), \quad l = 1, \dots, L-1, \\ x_L &:= A_L \cdot x_{L-1} - b_L \end{aligned} $$
is called the realization of the neural network with activation function \( \sigma \).
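A direct transcription of Definition 2.1.6 into NumPy might look as follows; representing \( \Phi \) as a list of \( (A_l, b_l) \) pairs, choosing ReLU as \( \sigma \), and the example dimensions are illustrative assumptions, not prescribed by the definition:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def realization(phi, x, sigma=relu):
    """Evaluate R_sigma(Phi)(x) for Phi = ((A_1, b_1), ..., (A_L, b_L)).

    The activation sigma is applied in layers 1, ..., L-1; the last layer
    is affine only, exactly as in Definition 2.1.6.
    """
    for A, b in phi[:-1]:
        x = sigma(A @ x - b)
    A_L, b_L = phi[-1]
    return A_L @ x - b_L

# Example: d = 3, L = 2, N_1 = 4, N_2 = 2 (random weights, just to show the shapes)
rng = np.random.default_rng(0)
phi = [(rng.standard_normal((4, 3)), rng.standard_normal(4)),
       (rng.standard_normal((2, 4)), rng.standard_normal(2))]
print(realization(phi, np.ones(3)))  # a vector in R^{N_L} = R^2
```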